Abstract: Mammography is an important research field, and mammography image classification is of interest
to many researchers today. The aim of this paper is to determine whether a mammography image shows
malignancy. Several methods can be used to detect malignancy. This paper presents the GLDM and Gabor
feature extraction methods along with the SVM and K-NN classifiers. Experiments were conducted on the MIAS
database. The results show that the combination of the GLDM feature extractor with the SVM classifier gives
the best results among the combinations tested.

I. INTRODUCTION

Cancer is the uncontrolled growth of cells; breast cancer is the uncontrolled growth of cells in the breast
region. Breast cancer is the second leading cause of cancer deaths in women today. Early detection of the cancer
can reduce the mortality rate. Mammography has a reported cancer detection rate of 70-90%, which means that
10-30% of breast cancers are missed with mammography [1]. Early detection of breast cancer can be achieved
using digital mammography, typically through detection of characteristic masses and/or micro-calcifications.
A mammogram is an X-ray of the breast tissue designed to identify abnormalities. Studies have shown
that radiologists can miss a significant proportion of abnormalities, in addition to having high rates of
false positives. Therefore, it would be valuable to develop a computer-aided method for mass/tumour
classification based on features extracted from the Region of Interest (ROI) in mammograms [3].

Pattern recognition in image processing requires the extraction of features from regions of the image and the
processing of these features with a pattern recognition algorithm. Features are observable patterns in the image
that give some information about it. For every pattern classification problem, the most important stage is
feature extraction, because the accuracy of the classification depends on it. The aim of computer-aided
analysis is not to replace radiologists but to offer a second opinion and thus support the radiologist's
decision making.

Much research has been done in mammography on detecting one or more abnormal structures: circumscribed
masses [5], spiculated lesions [6] and micro-calcifications [4]. Other researchers have focused on classifying
breast lesions as benign or malignant. There are different feature descriptors, such as GLDM (Gray Level
Difference Method), LBP (Local Binary Patterns), GLRLM (Gray Level Run Length Method), Haralick and Gabor
texture features, and different classification methods, such as SVM, C4.5 and K-NN.

In this paper we apply the GLDM and Gabor feature extraction methods to a set of mammography images and
then test their performance with the SVM and K-NN classification algorithms. The paper is organised as follows.
Section 2 explains the pre-processing stage, and Section 3 describes the feature extraction methods used in
the experiments. Section 4 gives an overview of the classification methods. Section 5 describes the
experiments and briefly discusses the results. Section 6 gives the conclusions derived from this work.

II. PREPROCESSING 

Pre-processing is used to improve the image quality of mammograms, which are very difficult to interpret.
Histogram equalization can be used to adjust the image contrast so that anomalies are better emphasized.
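
As an illustration of the histogram equalization step, the following is a minimal sketch and not the
implementation used in the experiments. It assumes an unsigned 8-bit grayscale image held in a NumPy array;
the function name equalize_histogram and the levels parameter are hypothetical.

```python
import numpy as np


def equalize_histogram(image: np.ndarray, levels: int = 256) -> np.ndarray:
    """Histogram-equalize an unsigned-integer grayscale image (assumed 8-bit)."""
    # Histogram and cumulative distribution of the gray levels.
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest level present
    denom = cdf[-1] - cdf_min
    if denom == 0:                     # constant image: nothing to stretch
        return image.copy()
    # Map each gray level so that the output CDF is approximately linear.
    lut = np.round((cdf - cdf_min) / denom * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(image.dtype)
    return lut[image]


# Usage: a low-contrast patch whose values span only 100..103.
patch = np.array([[100, 100, 101], [102, 103, 103]], dtype=np.uint8)
equalized = equalize_histogram(patch)
```

After equalization the darkest level present maps to 0 and the brightest to 255, which stretches the
contrast of the patch over the full gray-level range.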

III. FEATURE EXTRACTION 

Feature extraction reduces the amount of resources required to describe a large set of data accurately.
In image processing, different sets of features can be used to extract the visual information from a
given image. Because digital mammography images are specific, not all visual features can be used to describe
the relevant image patch correctly. The classes of suspected tissue differ in their shape and tissue
composition. This is why the most suitable visual feature descriptors for this kind of image are based on shape
and texture. Different feature extraction methods can be tested on a variety of classifiers. We use the GLDM
and Gabor feature descriptors.

GLDM 

The GLDM method calculates the gray level difference probability density functions for the given image.
This technique is usually used for extracting statistical texture features of a digital mammogram.
From each density function five texture features are defined: Contrast, Angular Second Moment (ASM), Entropy,
Mean and Inverse Difference Moment (IDM). Contrast is defined as the difference in intensity between the
highest and lowest intensity levels in an image and thus measures the local variations in the gray level.
Angular Second Moment is a measure of homogeneity: if the difference between gray levels over an area is low,
that area has a higher ASM value. Mean gives the average intensity value. Entropy is the average information
per intensity source output and measures the disorder of an image; when the image is not texturally
uniform, entropy is very large. Entropy is strongly, but inversely, correlated to energy. Inverse Difference
Moment measures the closeness of the distribution of elements in the Gray Level Co-occurrence Matrix
(GLCM) to the GLCM diagonal.

To describe the gray level difference method, let $g(n,m)$ be the digital picture function. For any given
displacement $\delta = (\Delta n, \Delta m)$, where $\Delta n$ and $\Delta m$ are integers, let
$g_\delta(n,m) = |g(n,m) - g(n+\Delta n, m+\Delta m)|$. Let $f(\cdot \mid \delta)$ be the estimated probability
density function associated with the possible values of $g_\delta$, i.e., $f(i \mid \delta) = P(g_\delta(n,m) = i)$.
Four possible forms of the vector $\delta$ are considered here: $(0,d)$, $(-d,d)$, $(d,0)$ and $(-d,-d)$, where
$d$ is the inter-sample distance. We refer to $f(\cdot \mid \delta)$ as the gray level difference density function.
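
The density function and the five features can be sketched as follows. This is a hedged illustration, not
the code used in the experiments: the function names are hypothetical, and the feature formulas are the
commonly used gray level difference statistics (for example, contrast as the second moment of the
difference distribution), which are an assumption here because the text above describes the features only
in words.

```python
import numpy as np


def gldm_density(image, displacement, levels=256):
    """Estimate f(i | delta) = P(g_delta(n, m) = i) for delta = (dn, dm)."""
    g = np.asarray(image, dtype=np.int64)
    dn, dm = displacement
    rows, cols = g.shape
    if abs(dn) >= rows or abs(dm) >= cols:
        raise ValueError("displacement is larger than the image")
    # Keep only positions where both g(n, m) and g(n + dn, m + dm) lie inside the image.
    r0, r1 = max(0, -dn), min(rows, rows - dn)
    c0, c1 = max(0, -dm), min(cols, cols - dm)
    diff = np.abs(g[r0:r1, c0:c1] - g[r0 + dn:r1 + dn, c0 + dm:c1 + dm])
    counts = np.bincount(diff.ravel(), minlength=levels)
    return counts / counts.sum()


def gldm_features(f):
    """Five texture features from one density function (assumed standard formulas)."""
    i = np.arange(f.size)
    nonzero = f[f > 0]
    return {
        "contrast": float(np.sum(i ** 2 * f)),
        "asm": float(np.sum(f ** 2)),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
        "mean": float(np.sum(i * f)),
        "idm": float(np.sum(f / (1.0 + i ** 2))),
    }


def four_displacements(d):
    """The four displacement vectors for inter-sample distance d."""
    return [(0, d), (-d, d), (d, 0), (-d, -d)]
```

Under these assumptions, calling gldm_features(gldm_density(image, delta)) for each delta in
four_displacements(d) yields 20 values per inter-sample distance d, which can be concatenated into one
feature vector.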

Gabor Texture Feature 

A Gabor filter is a linear filter. Basically, the Gabor texture feature is obtained from a group of wavelets,
with each wavelet capturing energy at a specific frequency and in a specific direction. From this group
of energy distributions the texture feature representing the image can be extracted. Thus a set of Gabor filters
with different frequencies and orientations may be helpful for extracting useful features from an image. Gabor
filters have been widely used in pattern analysis applications. The frequency (scale) and orientation
representations of Gabor filters are similar to those of the human and mammalian visual system, and they have
been found to be particularly appropriate for texture analysis.
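
A bank of Gabor filters and the resulting energy features can be sketched as below. This is only an
illustration under stated assumptions: an isotropic Gaussian envelope, a fixed kernel size and sigma,
circular (FFT-based) convolution, and the mean and standard deviation of the response magnitude as the
features. None of these choices is specified in this paper, and the function names are hypothetical.

```python
import numpy as np


def gabor_kernel(frequency, theta, sigma=3.0, size=15):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid along direction theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(2j * np.pi * frequency * x_rot)
    return envelope * carrier


def gabor_energy_features(image, frequencies, n_orientations):
    """Mean and standard deviation of the response magnitude for every filter in the bank."""
    img = np.asarray(image, dtype=float)
    img_fft = np.fft.fft2(img)
    features = []
    for frequency in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            kernel = gabor_kernel(frequency, theta)
            # Circular convolution via the FFT; the image is assumed to be at least kernel-sized.
            response = np.fft.ifft2(img_fft * np.fft.fft2(kernel, s=img.shape))
            magnitude = np.abs(response)
            features.extend([magnitude.mean(), magnitude.std()])
    return np.array(features)
```

For an image of vertical stripes, for instance, the filter whose direction runs across the stripes at the
matching frequency responds far more strongly than the perpendicular one, which is the directional
selectivity described above.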

IV. CLASSIFICATION METHODS 

There are numerous methods for the automated classification of samples. In this paper we work with two of
the most popular classification algorithms: SVM and K-NN.

Support Vector Machines

Support Vector Machines (SVMs) were introduced by Vladimir Vapnik and colleagues. SVMs are a relatively
new learning method used for binary classification, that is, for classifying examples that belong to one of
two possible classes. The basic idea is to find a hyperplane which separates the D-dimensional data perfectly
into its two classes. However, since example data is often not linearly separable, SVMs introduce the notion of
a kernel-induced feature space, which casts the data into a higher-dimensional space where the data is
separable.
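
The idea of casting data into a higher-dimensional space can be illustrated with a small sketch. It is not
an SVM: it uses an explicit feature map instead of a kernel function, and a plain perceptron as a stand-in
linear classifier, both of which are assumptions made only to keep the example short. The XOR-style data
and all names are hypothetical.

```python
import numpy as np


def lift(X):
    """Explicit feature map (x1, x2) -> (x1, x2, x1 * x2)."""
    return np.column_stack([X, X[:, 0] * X[:, 1]])


def train_perceptron(X, y, epochs=100):
    """Plain perceptron with a bias term; stops early once no mistakes remain."""
    Xb = np.column_stack([X, np.ones(len(X))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:
                w += yi * xi
                mistakes += 1
        if mistakes == 0:
            break
    return w


def accuracy(w, X, y):
    Xb = np.column_stack([X, np.ones(len(X))])
    return float(np.mean(np.sign(Xb @ w) == y))


# XOR-style data: no straight line separates the two classes in the original 2-D space.
X = np.array([[-1.0, -1.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

acc_original = accuracy(train_perceptron(X, y), X, y)
acc_lifted = accuracy(train_perceptron(lift(X), y), lift(X), y)
```

In the lifted space the product coordinate x1 * x2 alone separates the classes, so a hyperplane exists
there even though none exists in the original space.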

However, SVM classifiers can be extended to solve multiclass problems as well. One strategy for adapting
binary SVM classifiers to multiclass problems is the one-against-all (OvA) scheme. It decomposes the
M-class problem (M > 2) into a series of two-class problems. The basic concept is to construct M SVMs, where
the i-th classifier is trained to separate class i from the other (M-1) classes. This strategy has a few
advantages, such as its precision, its ease of implementation and its speed in both the training phase and
the recognition process, which is the reason for its wide use.
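
The one-against-all decomposition can be sketched as follows. The binary learner here is a simple linear
classifier trained by subgradient descent on the regularized hinge loss, used as a stand-in for a full SVM
solver; the hyperparameters, the function names and the rule of picking the class with the largest score
are assumptions for illustration.

```python
import numpy as np


def train_linear_hinge(X, y, lam=0.01, epochs=200, lr=0.1):
    """Linear classifier for labels y in {-1, +1}, fitted on the regularized hinge loss."""
    Xb = np.column_stack([X, np.ones(len(X))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        active = y * (Xb @ w) < 1          # samples that violate the margin
        grad = lam * w - (y[active][:, None] * Xb[active]).sum(axis=0) / len(X)
        w -= lr * grad
    return w


def train_one_vs_all(X, labels):
    """Train M binary classifiers; classifier i separates class i from the other M-1 classes."""
    classes = np.unique(labels)
    W = np.stack([train_linear_hinge(X, np.where(labels == c, 1.0, -1.0)) for c in classes])
    return classes, W


def predict_one_vs_all(classes, W, X):
    """Assign each sample to the class whose binary classifier gives the largest score."""
    Xb = np.column_stack([X, np.ones(len(X))])
    return classes[np.argmax(Xb @ W.T, axis=1)]


# Usage with three hypothetical, well-separated classes.
X_train = np.array([[-1.0, -1.0], [2.0, -1.0], [-1.0, 2.0]])
y_train = np.array([0, 1, 2])
classes, W = train_one_vs_all(X_train, y_train)
predicted = predict_one_vs_all(classes, W, X_train)
```

Taking the largest score resolves the cases where several of the M classifiers, or none of them, claim a
sample.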

K-NN

In pattern recognition, the k-nearest neighbor algorithm (k-NN) is a method for classifying objects
based on the closest training examples in the feature space. k-NN is a type of instance-based learning, or lazy
learning, where the function is only approximated locally and all computation is deferred until classification.
The k-nearest neighbor algorithm is amongst the simplest of all machine learning algorithms: an object is
classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst
its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned
to the class of its nearest neighbor. The training examples are vectors in a multidimensional feature space,
each with a class label. The training phase of the algorithm consists only of storing the feature vectors and
class labels of the training samples. In the classification phase, k is a user-defined constant, and an
unlabeled vector (a query or test point) is classified by assigning the label which is most frequent among the
k training samples nearest to that query point.
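
The classification phase described above can be sketched in a few lines. The Euclidean distance and the
tie-breaking rule (the label seen first among the nearest neighbors wins) are assumptions, since the
distance metric is not stated here; the function name and the toy data are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training samples."""
    train_X = np.asarray(train_X, dtype=float)
    # Euclidean distance from the query to every stored training vector.
    distances = np.linalg.norm(train_X - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances, kind="stable")[:k]
    votes = Counter(train_y[i] for i in nearest)
    # On a tie, most_common keeps the label encountered first, i.e. the closer neighbor's label.
    return votes.most_common(1)[0][0]


# Usage with two hypothetical clusters of 2-D feature vectors.
train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]
label = knn_predict(train_X, train_y, [0.2, 0.1], k=3)
```

Because training only stores the samples, all of the cost falls on prediction, which computes one distance
per stored vector for every query.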

V. RESULT

Table I and Fig. 1 show the classification precision of the two classifiers on the two-class problem with the
GLDM descriptor. With the SVM classifier, the GLDM descriptor gives its best classification accuracy of
95.83% at the largest displacement tested (20), and accuracy generally improves as the displacement is
increased. With the K-NN classifier the maximum accuracy is 50%, well below the 95.83% of the GLDM and
SVM combination. The GLDM and SVM combination therefore gives better results than the GLDM and K-NN
combination.



Sr. no.   Displacement   SVM (%)   K-NN (%)
1         8              62.50     41.67
2         10             75.00     45.83
3         12             70.83     50.00
4         16             87.50     41.67
5         20             95.83     33.33

Table I. Classification precision for the two-class problem (GLDM descriptor)

Fig. 1. Displacement versus percentage accuracy

Table II and Fig. 2 show the classification precision of the two classifiers on the two-class problem with the
Gabor texture feature descriptor. With the SVM classifier, the Gabor descriptor gives its best classification
accuracy of 71.83% at the highest orientation setting tested (4). With the K-NN classifier the maximum
accuracy is 58.33%, below the 71.83% of the Gabor and SVM combination. The Gabor and SVM combination
therefore gives better results than the Gabor and K-NN combination.



Sr. no.   Orientation   SVM (%)   K-NN (%)
1         1             66.67     45.83
2         2             70.83     58.33
3         3             62.50     37.50
4         4             71.83     50.00

Table II. Classification precision for the two-class problem (Gabor texture feature descriptor)

Fig. 2. Orientation versus percentage accuracy

VI. CONCLUSION 

Digital mammography is the most common method for early breast cancer detection. Automated analysis of these
images is very important, since manual analysis is slow, costly and inconsistent; today radiologists are
permitted to analyse manually only eight mammography slides per day.
In this paper we analysed two classifiers using two different descriptors for feature extraction. From the
experiments we can conclude that the best classification accuracy was achieved with the GLDM descriptor,
in combination with the SVM classifier.